The Complexity of Reasoning with FODD and GFODD
Recent work introduced Generalized First Order Decision Diagrams (GFODD) as a
knowledge representation that is useful in mechanizing decision theoretic
planning in relational domains. GFODDs generalize function-free first order
logic and include numerical values and numerical generalizations of existential
and universal quantification. Previous work presented heuristic inference
algorithms for GFODDs and implemented these heuristics in systems for decision
theoretic planning. In this paper, we study the complexity of the computational
problems addressed by such implementations. In particular, we study the
evaluation problem, the satisfiability problem, and the equivalence problem for
GFODDs under the assumption that the size of the intended model is given with
the problem, a restriction that guarantees decidability. Our results provide a
complete characterization placing these problems within the polynomial
hierarchy. The same characterization applies to the corresponding restriction
of problems in first order logic, giving an interesting new avenue for
efficient inference when the number of objects is bounded. Our results show
that for $\Sigma_k$ formulas, and for corresponding GFODDs, evaluation and
satisfiability are $\Sigma_k^p$ complete, and equivalence is $\Pi_{k+1}^p$
complete. For $\Pi_k$ formulas evaluation is $\Pi_k^p$ complete, satisfiability
is one level higher and is $\Sigma_{k+1}^p$ complete, and equivalence is $\Pi_{k+1}^p$
complete. Comment: A short version of this paper appears in AAAI 2014. Version 2
includes a reorganization and some expanded proof
Tight Bounds for Active Self-Assembly Using an Insertion Primitive
We prove two tight bounds on the behavior of a model of self-assembling
particles introduced by Dabby and Chen (SODA 2013), called insertion systems,
where monomers insert themselves into the middle of a growing linear polymer.
First, we prove that the expressive power of these systems is equal to
context-free grammars, answering a question posed by Dabby and Chen. Second, we
prove that systems of $k$ monomer types can deterministically construct
polymers of length $n = 2^{\Theta(k^{3/2})}$ in $O(\log^{5/3}(n))$ expected
time, and that this is optimal in both the number of monomer types and expected
time. Comment: To appear in Algorithmica. An abstract (12-page) version of this
paper appeared in the proceedings of ESA 201
Use of reverse transcription-polymerase chain reaction (RT-PCR) for Cymbidium mosaic virus (CyMV) detection in orchids
The reverse transcription-polymerase chain reaction (RT-PCR) was
adapted for detection of Cymbidium mosaic virus (CyMV) in orchids.
The oligonucleotide primers used were selected from the predicted
homologous coat protein region of CyMV and other Potexviruses
which enabled amplification of fragments of approximately 313 bp and 227 bp
using optimum reaction conditions of 2.5 mM MgCl2 and 30 cycles of
amplification. The RT-PCR allowed the detection of CyMV RNA and virion in
purified forms as well as in crude tissue extracts of orchid. Direct
CyMV RNA detection was possible in leaves, shoots, stems, roots and
petals. The detection limits of RNA in purified CyMV and virion by
RT-PCR described were 10 ng and 2 ng, respectively. The PCR
amplified fragments were confirmed to be CyMV-specific by dot-blot
hybridization with DIG-labelled CyMV cDNA probe.
The suitability of the RT-PCR in routine testing of CyMV was
determined and compared with that of DAS-ELISA. Thirty samples
of leaf tissues representing various genera or hybrids of cultivated
local orchid from glasshouse and commercial nurseries were tested
for CyMV by RT-PCR and DAS-ELISA. Among 15 samples that
tested positive for CyMV infection by DAS-ELISA, only 7 samples
gave the expected amplification fragments when subjected to RT-PCR
assays. The equal detection limit on purified CyMV virion by
RT-PCR and DAS-ELISA and lower sensitivity of RT-PCR in
detecting CyMV in a field indexing trial suggested that RT-PCR is
unsuitable to replace DAS-ELISA for routine testing of CyMV in
local orchids.
Going the distance for protein function prediction: a new distance metric for protein interaction networks
Due to an error introduced in the production process, the x-axes in the first panels of Figure 1 and Figure 7 are not formatted correctly. The correct Figure 1 can be viewed here: http://dx.doi.org/10.1371/annotation/343bf260-f6ff-48a2-93b2-3cc79af518a9
In protein-protein interaction (PPI) networks, functional similarity is often inferred based on the function of directly interacting proteins, or more generally, some notion of interaction network proximity among proteins in a local neighborhood. Prior methods typically measure proximity as the shortest-path distance in the network, but this has only a limited ability to capture fine-grained neighborhood distinctions, because most proteins are close to each other, and there are many ties in proximity. We introduce diffusion state distance (DSD), a new metric based on a graph diffusion property, designed to capture finer-grained distinctions in proximity for transfer of functional annotation in PPI networks. We present a tool that, given a PPI network as input, outputs the DSD distances between every pair of proteins. We show that replacing the shortest-path metric by DSD improves the performance of classical function prediction methods across the board.
MC, HZ, NMD and LJC were supported in part by National Institutes of Health (NIH) R01 grant GM080330. JP was supported in part by NIH grant R01 HD058880. This material is based upon work supported by the National Science Foundation under grant numbers CNS-0905565, CNS-1018266, CNS-1012910, and CNS-1117039, and supported by the Army Research Office under grant W911NF-11-1-0227 (to MEC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
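The abstract above does not spell out how DSD is computed. The following is a minimal sketch under one stated assumption: that the distance between two proteins is the L1 distance between their vectors of expected visit counts for a k-step random walk on the network. The function names (`transition_matrix`, `expected_visits`, `dsd`) and the choice `k=5` are hypothetical and are not taken from the authors' tool.

```python
def transition_matrix(adj):
    """Row-normalise an adjacency matrix into random-walk probabilities."""
    P = []
    for row in adj:
        degree = sum(row)
        if degree == 0:
            raise ValueError("isolated node: the random walk is undefined")
        P.append([w / degree for w in row])
    return P


def matmul(A, B):
    """Plain square-matrix product, to keep the sketch dependency-free."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def expected_visits(adj, k):
    """he[u][v] = expected visits to v by a k-step walk from u (step 0 counted).

    ASSUMPTION: this finite-k visit count stands in for the "graph diffusion
    property" mentioned in the abstract.
    """
    n = len(adj)
    P = transition_matrix(adj)
    step = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # P^0
    he = [row[:] for row in step]
    for _ in range(k):
        step = matmul(step, P)  # step now holds the next power of P
        for i in range(n):
            for j in range(n):
                he[i][j] += step[i][j]
    return he


def dsd(adj, k=5):
    """All-pairs L1 distance between expected-visit vectors (assumed DSD form)."""
    he = expected_visits(adj, k)
    n = len(he)
    return [[sum(abs(he[u][w] - he[v][w]) for w in range(n)) for v in range(n)]
            for u in range(n)]


if __name__ == "__main__":
    # Toy network: hub 0 linked to 1, 2 and 3, plus an extra edge 3-4.
    toy = [
        [0, 1, 1, 1, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 1],
        [0, 0, 0, 1, 0],
    ]
    for row in dsd(toy):
        print(["%.3f" % d for d in row])
```

In the toy network, nodes 1, 2 and 3 are all at shortest-path distance 2 from one another, whereas a visit-count distance can still separate them because node 3 has an extra neighbour. This is the kind of tie that the abstract says shortest-path distance cannot break.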
A multi-species functional embedding integrating sequence and network structure
A key challenge to transferring knowledge between species is that different species have fundamentally different genetic architectures. Initial computational approaches to transfer knowledge across species have relied on measures of heredity such as genetic homology, but these approaches suffer from limitations. First, only a small subset of genes have homologs, limiting the amount of knowledge that can be transferred, and second, genes change or repurpose functions, complicating the transfer of knowledge. Many approaches address this problem by expanding the notion of homology by leveraging high-throughput genomic and proteomic measurements, such as through network alignment. In this work, we take a new approach to transferring knowledge across species by expanding the notion of homology through explicit measures of functional similarity between proteins in different species. Specifically, our kernel-based method, HANDL (Homology Assessment across Networks using Diffusion and Landmarks), integrates sequence and network structure to create a functional embedding in which proteins from different species are embedded in the same vector space. We show that inner products in this space and the vectors themselves capture functional similarity across species, and are useful for a variety of functional tasks. We perform the first whole-genome method for predicting phenologs, generating many that were previously identified, but also predicting new phenologs supported by the biological literature. We also demonstrate that the HANDL embedding captures pairwise gene function, in that gene pairs with synthetic lethal interactions are significantly separated in HANDL space, and the direction of separation is conserved across species. Software for the HANDL algorithm is available at http://bit.ly/lrgr-handl.
Functional protein representations from biological networks enable diverse cross-species inference
Partial funding for Open Access provided by the UMD Libraries' Open Access Publishing Fund.
Transferring knowledge between species is key for
many biological applications, but is complicated
by divergent and convergent evolution. Many current
approaches for this problem leverage sequence
and interaction network data to transfer knowledge
across species, exemplified by network alignment
methods. While these techniques do well, they are
limited in scope, creating metrics to address one
specific problem or task. We take a different approach
by creating an environment where multiple
knowledge transfer tasks can be performed using
the same protein representations. Specifically, our
kernel-based method, MUNK, integrates sequence
and network structure to create functional protein
representations, embedding proteins from different
species in the same vector space. First, we show that
proteins in different species that are close in MUNK-space
are functionally similar. Next, we use these representations
to share knowledge of synthetic lethal
interactions between species. Importantly, we find
that the results using MUNK-representations are at
least as accurate as existing algorithms for these
tasks. Finally, we generalize the notion of a phenolog
(‘orthologous phenotype’) to use functionally similar
proteins (i.e. those with similar representations). We
demonstrate the utility of this broadened notion by
using it to identify known phenologs and novel non-obvious
ones supported by current research.
Assessment of network module identification across complex diseases
Many bioinformatics methods have been proposed for reducing the complexity of large gene or protein networks into relevant subnetworks or modules. Yet, how such methods compare to each other in terms of their ability to identify disease-relevant modules in different types of network remains poorly understood. We launched the 'Disease Module Identification DREAM Challenge', an open competition to comprehensively assess module identification methods across diverse protein-protein interaction, signaling, gene co-expression, homology and cancer-gene networks. Predicted network modules were tested for association with complex traits and diseases using a unique collection of 180 genome-wide association studies. Our robust assessment of 75 module identification methods reveals top-performing algorithms, which recover complementary trait-associated modules. We find that most of these modules correspond to core disease-relevant pathways, which often comprise therapeutic targets. This community challenge establishes biologically interpretable benchmarks, tools and guidelines for molecular network analysis to study human disease biology